Recognition of Handwritten Chinese Text by Segmentation: A Segment-Annotation-Free Approach


Abstract

Online and offline handwritten Chinese text recognition (HCTR) has been studied for decades. Early methods adopted oversegmentation-based strategies but suffered from low speed, insufficient accuracy, and the high cost of character segmentation annotations. Recently, segmentation-free methods based on connectionist temporal classification (CTC) and the attention mechanism have dominated the field of HCTR. However, people actually read text character by character, especially for ideograms such as Chinese. This raises the question: are segmentation-free strategies really the best solution to HCTR? To explore this issue, we propose a new segmentation-based method for recognizing handwritten Chinese text that is implemented using a simple yet efficient fully convolutional network. A novel weakly supervised learning method is proposed to enable the network to be trained using only transcript annotations; thus, the expensive character segmentation annotations required by previous segmentation-based methods can be avoided. Owing to the lack of context modeling in fully convolutional networks, we propose a contextual regularization method to integrate contextual information into the network during the training stage, which can further improve the recognition performance. Extensive experiments conducted on four widely used benchmarks, namely CASIA-HWDB, CASIA-OLHWDB, ICDAR2013, and SCUT-HCCDoc, show that our method significantly surpasses existing methods on both online and offline HCTR, and exhibits a considerably higher inference speed than CTC/attention-based approaches.


Similar resources

Research of Chinese Handwritten Text Segmentation Algorithm

OCR is a complicated process, and many factors can influence the recognition rate. In the early period, researchers tried to optimize the classifier to obtain a high recognition rate, on the premise that the input contains only one character, whether printed or handwritten. Because classifier performance has improved considerably, the recognition rate for a single character is now high enough for commercial use. W...


On-line Handwritten Japanese Text Recognition by Improving Segmentation Quality

This paper describes a method of on-line handwritten Japanese text recognition by improving segmentation quality. The method produces hypothetical segmentation points according to features such as distance and overlap between adjacent strokes. Moreover, it extracts multidimensional features from these hypothetical segmentation points and applies an SVM to the extracted features to produce segm...


A Hybrid Handwritten Chinese Address Recognition Approach

Handwritten Chinese Address Recognition describes a difficult yet important pattern recognition task. There are three difficulties in this problem: (1) Handwritten addresses are often written in free styles with high variation, resulting in inevitable segmentation errors. (2) The number of Chinese characters is large, leading to a low recognition rate for single Chinese characters. (3) Chinese address is us...


A Structural Approach for Segmentation of Handwritten Hindi Text

This paper attempts to segment handwritten Hindi words. The problem of segmentation is compounded by the possible presence of modifiers (matras) on all sides of the basic characters and by the uncertainty that different writing styles introduce in the character shapes. We have devised a structural approach to capture the similarities and differences between structure class...


A Multi-Agent Approach to Arabic Handwritten Text Segmentation

The segmentation of individual words into characters is a vital process in handwritten character recognition systems. In this paper, a novel approach is proposed to segment handwritten Arabic text (words). We consider the “Naskh” font style. The segmentation algorithm employs seven agents in order to detect regions where segmentation is illegal. Feature points (end points) are extracted from th...



Journal

Journal title: IEEE Transactions on Multimedia

Year: 2023

ISSN: 1520-9210, 1941-0077

DOI: https://doi.org/10.1109/tmm.2022.3146771